Speech Recognition for Learning

The National Center for Technology Innovation

Speech recognition, also referred to as speech-to-text or voice recognition, is technology that recognizes speech, allowing voice to serve as the "main interface between the human and the computer" [i]. This Info Brief discusses how current speech recognition technology facilitates student learning, as well as how the technology can develop to advance learning in the future.

Although speech recognition has a potential benefit for students with physical disabilities and severe learning disabilities, the technology has been inconsistently implemented in the classroom over the years. As the technology continues to improve, however, many of the issues are being addressed. If you haven't used speech recognition with your students lately, it may be time to take another look. Both Microsoft and Apple have built speech recognition capabilities into their operating systems, so you can easily try out these features with your students to find out whether speech recognition might be right for them.

Speech recognition vs. speech-to-text: what's the difference?

When researching speech recognition tools for your child or your classroom, you may see technologies referred to variously as "speech-to-text," "voice recognition," or "speech recognition," sometimes all within the same product description. Though the terms can be confusing, they all refer to technologies that can translate spoken language into digitized text or turn spoken commands into actions (e.g., "open Microsoft Word"). Voice recognition can refer to products that need to be trained to recognize a specific voice (such as Dragon NaturallySpeaking), or to products used in applications like automated call centers that are capable of recognizing a limited vocabulary from any user. Quite frequently, as in this article, the terms speech recognition and voice recognition are used interchangeably.

Speech recognition technology in everyday life

Speech recognition and speech-to-text programs have a number of applications for users with and without disabilities. Speech-to-text has been used to help struggling writers boost their writing production [ii] and to provide alternate access to a computer for individuals with physical impairments [iii]. Other applications include speech recognition for foreign language learning [iv], voice-activated products for the blind [v], and many familiar mainstream technologies.

New developments in the technology have driven innovation in many familiar customer service applications. We have all used voice recognition technologies in our daily lives, many times without even thinking about it: automated phone menus and directories, voice-activated dialing on our cell phones, and integrated voice commands on smartphones are just a few examples [vi]. Medical and legal professionals use voice recognition every day to dictate notes and transcribe important information. Newer uses of the technology include military applications, navigation systems, automotive speech recognition (Ford SYNC), "smart" homes designed with voice command devices, and video games such as EndWar, which allows players to give orders to their troops using only their voice.

Benefits of speech recognition for struggling writers

Populations that may benefit from speech recognition technologies for learning include users with:

  • Learning disabilities, including dyslexia and dysgraphia
  • Repetitive strain injuries, such as carpal tunnel syndrome
  • Poor or limited motor skills
  • Vision impairments
  • Physical disabilities
  • Limited English language proficiency [vii]

Benefits for students with disabilities may include improved access to the computer, increases in writing production, improvements in writing mechanics, increased independence, decreased anxiety around writing, and improvements in core reading and writing abilities.

Improved access

For students with motor skill limitations, physical disabilities, blindness/low vision, or other difficulties accessing a standard keyboard and mouse, hands-free computing through the use of speech recognition technologies may be beneficial. By removing the physical barriers to writing and navigation of the computer, you can increase student access to technology and classroom activities.

Writing production

For students with learning disabilities, speech recognition technology can encourage writing that is more thoughtful and deliberate [viii]. Studies with middle and high school students with learning disabilities have shown that input via speech is less challenging and that students frequently generate papers that are longer and of better quality when using speech recognition technologies [ix].

Mechanics of writing

Speech recognition technologies, in conjunction with the features of word processors, can help reduce some of the difficulties that students may face with writing mechanics. Because students can often write more quickly with speech recognition tools, the tools remove potential obstacles, such as difficulty with handwriting or the need to transcribe thoughts while brainstorming. Often, writers with learning disabilities will skip over words when they are unsure of the correct spelling, leading to pieces of writing that are short, missing key elements, or not reflective of the student's true abilities [x]. Speech recognition and word processors can potentially alleviate some of these concerns by allowing students to get their thoughts out on paper without worrying about these or other technical writing components [xi].

Increased independence

For students with physical disabilities, poor motor skills, or learning disabilities, a human transcriber is a low-tech solution for the classroom that allows the focus to shift from the physical act of writing to expressing thoughts and knowledge. However, a transcriber makes the student dependent upon a teacher or aide for writing tasks. Students who use transcribers for writing often report "spending less time planning and organizing because they felt they were keeping the transcriber waiting, or felt embarrassment about making mistakes or asking for multiple readings of what was written." [xii] Using speech-to-text tools can allow the student to be more independent in their writing and other academic activities. If the speech-to-text program also includes text-to-speech features, the student may hear their text read aloud to them multiple times, and correct their errors more independently.

Decreased anxiety

In addition to allowing the student to work in a more independent manner, speech recognition can allow students to write without fear of spelling errors, helping them avoid the anxieties associated with mechanics, organization, and editing [xiii]; many struggling writers feel embarrassment about "the appearance of their writing due to brevity of sentence or paragraph length, illegibility of handwriting, and/or misspelled words." [xiv]

For students who are English Language Learners, or are learning a second language, speech recognition programs can allow them to practice pronunciation in a safe, low-stress environment. Students can engage in multiple repetitions of an unfamiliar word without worrying about feeling embarrassed [xv]. Some popular foreign language software programs now include speech recognition features for just this purpose.

Improvements in core reading and writing abilities

Research has shown that speech recognition tools can also serve a remedial function for students with learning disabilities in the areas of reading and writing. By seeing the words on screen as they dictate, students can gain insight into important elements of phonemic awareness, such as sound-symbol correspondence. As students speak and see their words appear on the screen, the speech-to-text tool directly demonstrates the relationship between how a word looks and how it sounds [xvi]. This bimodal presentation of text can be especially helpful for students with learning disabilities, and is thought to be why speech recognition has been found effective in remediating reading and spelling deficits.

Another key benefit of speech recognition technologies is the error correction process. Because no speech recognition product is completely accurate, "it requires users to check the accuracy of each word uttered as sentences are being dictated. When an error is made, the child must then find the correct word among a list of similar words and choose it" [xvii]. This process requires the user to examine the word list closely, compare words that look or sound alike, and make decisions about the best word for the specific situation. This can give students with learning disabilities (LD) a boost in reading and spelling as they learn to discriminate between similar words [xviii].
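The "list of similar words" step described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not how any particular product works: the vocabulary and the function name are hypothetical, and the comparison uses spelling similarity from Python's standard difflib module, whereas real speech recognition products also draw on how words sound.

```python
import difflib

# Hypothetical mini-vocabulary for illustration only; a real product
# uses a far larger lexicon and acoustic information as well.
VOCABULARY = ["their", "there", "they're", "three", "then"]

def similar_words(recognized_word, vocabulary=VOCABULARY, limit=4):
    """Return up to `limit` vocabulary words that look most like the recognized word."""
    return difflib.get_close_matches(recognized_word, vocabulary, n=limit, cutoff=0.6)

# The program typed "there"; the student reviews look-alike choices
# and picks the word that fits the sentence.
choices = similar_words("there")
for number, word in enumerate(choices, start=1):
    print(number, word)
```

Reading through a list like this one is what pushes the student to compare words that look or sound alike before choosing.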

Challenges

Despite advances over the past 20 years, today's speech recognition technology still presents challenges for students with disabilities. As with any new technology tool, students must initially become comfortable with using speech-to-text, including training it to recognize their voices, gaining experience with a new way of writing, understanding the differences between writing and speaking, and correcting errors within the text. For students with learning disabilities, struggling readers and writers, or very young students, this may add frustration to the writing process. Though the software has improved, speech-to-text programs are not always capable of recognizing the voices of young children, so students must adjust to speaking more slowly so that the technology can more accurately transcribe their thoughts [xix].

Because speaking to write is an activity that requires different skills than speaking in conversation, students must be aware of the differences between the two. This may be challenging for early writers who have not yet made that distinction. Using speech recognition technology may make it more difficult for younger students to begin differentiating between writing and speaking. Thus, it is critical that use of speech recognition technology be paired with instruction on writing strategies, brainstorming, drafting, and organization [xx].

Another key element involved in using speech recognition programs is the need for error correction and monitoring of misrecognized words. Newer programs do not make spelling mistakes, and they improve when users correct misrecognized words. Their errors therefore appear as correctly spelled but wrong words, so students must be alert for errors that the program itself does not flag (e.g., incorrect word choices, or words misunderstood by the software). While this process can be taxing for struggling readers, a program that is also capable of reading text back to the user can help them with editing and revising.
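A short sketch can show why these errors are easy to miss. It assumes a hypothetical mini-dictionary standing in for a word processor's spell checker, and an example sentence invented for illustration; none of these names come from a real product.

```python
# Hypothetical mini-dictionary standing in for a spell checker's word list.
DICTIONARY = {"i", "want", "to", "write", "right", "a", "letter", "my", "friend"}

def misspelled_words(text, dictionary=DICTIONARY):
    """Return the words in `text` that are not found in the dictionary."""
    return [word for word in text.lower().split() if word not in dictionary]

intended = "I want to write a letter to my friend"
recognized = "I want to right a letter to my friend"  # "write" misheard as "right"

# The spell check finds nothing, because "right" is a correctly spelled word.
flagged = misspelled_words(recognized)
print(flagged)

# Only a comparison against what the student meant reveals the error.
differences = [(meant, typed)
               for meant, typed in zip(intended.split(), recognized.split())
               if meant != typed]
print(differences)
```

Since the software cannot know what the student meant, that comparison has to be made by the student while rereading or listening to the text read back.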

Another implementation challenge is that the software requires a good deal of memory, and each student's voice files must be saved in a single location, such as one folder on a server. These voice files improve in accuracy with use, so it is important that students work in their own saved file. As a result, this assistive technology is not always portable. Schools have overcome this challenge by assigning students laptops with the software installed or by storing files on a networked server that can be accessed from anywhere on campus.

As with any assistive technology solution, finding funding for speech recognition solutions may also present a challenge for schools. The first step in obtaining any assistive technology for your students is to conduct a thorough assessment to determine what would best meet each student's learning needs. Because of the implementation challenges listed above, speech recognition software may not be the best fit for certain students. Once a potentially beneficial solution is agreed upon, there are many options for schools looking for funding for assistive technology (AT), from grant programs, to used AT marketplaces, to loan programs from vendors and assistive technology centers.

Improving student success with speech recognition

Speech recognition isn't perfect and may not be the best choice for all students with disabilities, but it does have some significant benefits for certain students that make it worth the time investment. If speech recognition tools are right for your student, here are several tips for improving student success:

  • Be sure that your computer has a good quality microphone and sound card, and meets the minimum memory capacity and processing speed requirements listed for the software you purchase. Many speech recognition companies will recommend specific microphones that have been shown to work well with their software.
  • For students with LD using speech recognition, explicit instruction in reading skills, phonological awareness, writing strategies and organizational strategies may be helpful.
  • For students who struggle with reading, picking out software that includes a read-back or text-to-speech feature can help with error correction and editing.

The future of speech recognition

More research is still needed on the efficacy of speech recognition for children with LD and other types of disabilities. However, the technology is continuing to move forward and address many of the problems encountered before. For example, many newer versions of speech recognition software now include voice profiles for children, meaning that they are becoming more accurate at distinguishing words spoken by younger users.

As industries begin to use some elements of voice recognition technology in their day-to-day work (military, medical, legal) [xxi], it makes sense for students to gain some familiarity with speech recognition. Some technologies initially designed for users with disabilities have seen transitions into mainstream technology, becoming something that we all come to rely on in our daily lives [xxii]. Because of this, technology industry leaders are beginning to believe that all students should receive a technology education that reflects the future of human-computer interactions, which they predict will be primarily through voice and touch.

Footnotes

[i] Nuance Communications. (2009). Dragon NaturallySpeaking: Helping all students reach their full potential. March 2009 White Paper, Nuance Communications.

[ii] Higgins, E.L., & Raskind, M.H. (2000). Speaking to read: The effects of continuous vs. discrete speech recognition systems on the reading and spelling of children with learning disabilities. Journal of Special Education Technology, 15(1), 19-30; Higgins, E.L., & Raskind, M.H. (1995). Compensatory effectiveness of speech recognition on the written composition performance of postsecondary students with learning disabilities. Learning Disability Quarterly, 18(2), 159-174; MacArthur, C.A. (2009). Reflections on research on writing and technology for struggling writers. Learning Disabilities Research & Practice, 24(2), 93-103.

[iii] Bruce, C., Edmundson, A., & Coleman, M. (2003). Writing with voice: An investigation of the use of a voice recognition system as a writing aid for a man with aphasia. International Journal of Language & Communication Disorders, 38(2), 131-148.

[iv] Chiu, T.L., Liou, H.C., & Yeh, Y. (2007). A study of web-based oral activities enhanced by automatic speech recognition for EFL college learning. Computer Assisted Language Learning, 20(3), 209-233; Jones, G., Squires, T., & Hicks, J. (2007-2008). Combining speech recognition/natural language processing with 3D online learning environments to create distributed authentic and situated spoken language learning. Journal of Educational Technology Systems, 36(4), 375-392.

[v] Freitas, D., & Kouroupetroglou, G. (2008). Speech technologies for blind and low vision persons. Technology and Disability, 20, 135-156.

[vi] Jana, R. (2009). How tech for the disabled is going mainstream. BusinessWeek, 4149, 58-60; Lohr, S., & Markoff, J. (2010, June 24). Smarter than you think: Computers learn to listen, and some talk back. New York Times. Retrieved from http://www.nytimes.com/2010/06/25/science/25voice.html?ref=science; Monaghan, P. (2010). Design for disability will become the norm. Chronicle of Higher Education, 56(2), B6-B7.

[vii] Bruce, C., Edmundson, A., & Coleman, M. (2003). Writing with voice: An investigation of the use of a voice recognition system as a writing aid for a man with aphasia. International Journal of Language & Communication Disorders, 38(2), 131-148; Chiu, T.L., Liou, H.C., & Yeh, Y. (2007). A study of web-based oral activities enhanced by automatic speech recognition for EFL college learning. Computer Assisted Language Learning, 20(3), 209-233; Freitas, D., & Kouroupetroglou, G. (2008). Speech technologies for blind and low vision persons. Technology and Disability, 20, 135-156; Gardner, T.J. (2008). Speech recognition for students with disabilities in writing. Physical Disabilities: Education and Related Services, 26(2), 43-53; Higgins, E.L., & Raskind, M.H. (2004). Speech recognition-based and automaticity programs to help students with severe reading and spelling problems. Annals of Dyslexia, 54(2), 365-392; Honeycutt, L. (2003). Researching the use of voice recognition writing software. Computers and Composition, 20, 77-95; Jones, G., Squires, T., & Hicks, J. (2007-2008). Combining speech recognition/natural language processing with 3D online learning environments to create distributed authentic and situated spoken language learning. Journal of Educational Technology Systems, 36(4), 375-392; MacArthur, C.A. (2009). Reflections on research on writing and technology for struggling writers. Learning Disabilities Research & Practice, 24(2), 93-103.

[viii] Higgins, E.L., & Raskind, M.H. (2004). Speech recognition-based and automaticity programs to help students with severe reading and spelling problems. Annals of Dyslexia, 54(2), 365-392; Honeycutt, L. (2003). Researching the use of voice recognition writing software. Computers and Composition, 20, 77-95.

[ix] Gardner, T.J. (2008). Speech recognition for students with disabilities in writing. Physical Disabilities: Education and Related Services, 26(2), 43-53; Higgins, E.L., & Raskind, M.H. (2004). Speech recognition-based and automaticity programs to help students with severe reading and spelling problems. Annals of Dyslexia, 54(2), 365-392; Nuance Communications. (2009). Dragon NaturallySpeaking: Helping all students reach their full potential. March 2009 White Paper, Nuance Communications.

[x] Gardner, T.J. (2008). Speech recognition for students with disabilities in writing. Physical Disabilities: Education and Related Services, 26(2), 43-53.

[xi] Higgins, E.L., & Raskind, M.H. (2004). Speech recognition-based and automaticity programs to help students with severe reading and spelling problems. Annals of Dyslexia, 54(2), 365-392; Nuance Communications. (2009). Dragon NaturallySpeaking: Helping all students reach their full potential. March 2009 White Paper, Nuance Communications; Peterson-Karlan, G., Hourcade, J., & Parette, P. (2008). A review of assistive technology and writing skills for students with physical and educational disabilities. Physical Disabilities: Education and Related Services, 26(2), 13-32.

[xii] Caverly, D.C. (2008). Techtalk: Assistive technology for writing. Journal of Developmental Education, 31(3), 36-37.

[xiii] Gardner, T.J. (2008). Speech recognition for students with disabilities in writing. Physical Disabilities: Education and Related Services, 26(2), 43-53.

[xiv] Gardner, T.J. (2008). Speech recognition for students with disabilities in writing. Physical Disabilities: Education and Related Services, 26(2), 44.

[xv] Chiu, T.L., Liou, H.C., & Yeh, Y. (2007). A study of web-based oral activities enhanced by automatic speech recognition for EFL college learning. Computer Assisted Language Learning, 20(3), 209-233.

[xvi] Nuance Communications (2009); Silver-Pacuilla, H. (2006). Access and benefits: Assistive technology in adult literacy. Journal of Adolescent and Adult Literacy, 50(2), 114-25.

[xvii] Higgins, E.L., & Raskind, M.H. (2004). Speech recognition-based and automaticity programs to help students with severe reading and spelling problems. Annals of Dyslexia, 54(2), 365-392.

[xviii] Nuance Communications. (2009). Dragon NaturallySpeaking: Helping all students reach their full potential. March 2009 White Paper, Nuance Communications.

[xix] Honeycutt, L. (2003). Researching the use of voice recognition writing software. Computers and Composition, 20, 77-95.

[xx] Higgins, E.L., & Raskind, M.H. (2000). Speaking to read: The effects of continuous vs. discrete speech recognition systems on the reading and spelling of children with learning disabilities. Journal of Special Education Technology, 15(1), 19-30.

[xxi] Lohr, S., & Markoff, J. (2010, June 24). Smarter than you think: Computers learn to listen, and some talk back. New York Times. Retrieved from http://www.nytimes.com/2010/06/25/science/25voice.html?ref=science.

[xxii] Jana, R. (2009). How tech for the disabled is going mainstream. BusinessWeek, 4149, 58-60; Monaghan, P. (2010). Design for disability will become the norm. Chronicle of Higher Education, 56(2), B6-B7.

Posted on BrainLine December 6, 2010.

An "Info Brief" from the National Center for Technology Innovation (NCTI), 2010. Used with permission.